Model distillation

Distillation is the process of training a smaller model (the student) to mimic the behavior of a much larger model (the teacher).
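One common way to make "mimics the behavior" concrete is to train the small model to match the large model's temperature-softened output distribution. The sketch below is a minimal illustration of that idea using only the Python standard library. It is not taken from the cited work, and the function names, the example logits, and the temperature value are all assumptions chosen for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert logits to probabilities, softened by a temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    The temperature**2 factor is a commonly used convention that keeps
    gradient magnitudes comparable across temperatures (an assumption
    here, not a requirement of distillation in general).
    """
    p = softmax(teacher_logits, temperature)  # soft targets from the large model
    q = softmax(student_logits, temperature)  # predictions of the small model
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl

# Hypothetical logits for a single 3-class example.
teacher = [4.0, 1.0, 0.5]
student = [2.5, 1.5, 0.2]
print(distillation_loss(student, teacher))
```

The loss is zero when the two models produce the same softened distribution and grows as they diverge. In practice this term is often combined with the ordinary supervised loss on ground-truth labels.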

Busbridge2025distillation studies scaling laws for distillation.